DETAILED ACTION 

1. This action is responsive to an amendment to application 09/669,680 filed on 06/22/2004. 

2. Claims 1-29 are pending in the case. Claims 1, 8, 15, 20, and 23 are independent claims. 
Claim 30 has been cancelled. Claims 1 and 23-29 have been amended. 

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

3. Claims 1-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over Lantrip 
et al. (USPN 6,298,174 B1, filing date 10/15/1999), hereinafter Lantrip, further in 
view of Ruocco et al. (USPN 5,864,855, filing date 2/26/1996), hereinafter Ruocco. 

4. Regarding independent claim 1, Lantrip discloses a method of clustering documents in 
datasets (in col. 2, lines 39-42, document vectors are arranged into clusters) comprising: 
clustering first documents in a first dataset to produce first document classes (in col. 2, 
lines 39-42, document vectors are arranged into clusters); and creating centroid seeds 
based on said first document classes (in col. 2, lines 43-45, the invention finds centroids). 
However, Lantrip fails to disclose clustering second documents in a second dataset using 
said centroid seeds. Ruocco, in the claim at col. 14, lines 10-45, discloses 
processing second datasets in parallel based on cluster information from previous 
cluster vectors (see col. 14, lines 28-30) in order to gain the benefit of information from 
previous clusters to improve analysis of subsequent datasets. It would have been obvious 
to one of ordinary skill in the art at the time of the invention to use the information 
contained in the centroid seeds from Lantrip for subsequent datasets as in Ruocco in 
order to improve analysis of subsequent datasets. 

Application/Control Number: 09/669,680 
Art Unit: 2178 

5. Regarding dependent claim 2, Lantrip and Ruocco fail to disclose that said first dataset 
and said second dataset are related. However, it was notoriously well known in the art at 
the time of the invention that if one intends to process a dataset based on the results of 
previously processing another dataset, the datasets should be related in order for the 
results to be meaningful. It would have been obvious to one of ordinary skill in the art at 
the time of the invention to have the first and second dataset be related in order for the 
results to be meaningful. 

6. Regarding dependent claim 3, Lantrip discloses that the clustering of said first 
documents in said first dataset comprises: forming a first dictionary of most common 
words in said first dataset (in col. 2, lines 30-40, Lantrip creates a database based on the 
dataset, which would include the common words); generating a first vector space model 
by counting, for each word in said first dictionary, a number of said first documents in 
which said word occurs (in col. 2, lines 35-42, Lantrip creates a vector space model); and 
clustering said first documents in said first dataset based on said first vector space model 
(in col. 2, lines 39-42, Lantrip carries out clustering). 
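
The dictionary and vector space model steps recited in claim 3 can be illustrated with a minimal sketch. This is one possible reading of the claim language, not code taken from the application, Lantrip, or Ruocco. The function names, the whitespace tokenizer, and the sample documents are all hypothetical.

```python
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_dictionary(docs, size):
    # Count, for each word, the number of documents in which it occurs.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    # Keep the `size` most common words; ties are broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]

def vector_space_model(docs, dictionary):
    # Assumed layout: one row per document, one column per dictionary word.
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[word] for word in dictionary])
    return rows

# Hypothetical first dataset.
first_docs = ["printer driver error", "printer paper jam", "password reset error"]
dictionary = build_dictionary(first_docs, size=3)
model = vector_space_model(first_docs, dictionary)
```

Under these assumptions, the rows of `model` are the document vectors that a clustering step would then group into the first document classes.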

7. Regarding dependent claim 4, Lantrip fails to disclose a method further comprising 
generating a second vector space model by counting, for each word in said first 
dictionary, a number of said second documents in which said word occurs. However, 
Ruocco, in col. 14, lines 20-35, discloses generating such a vector space model for 
multiple document sets in order to aid in the clustering analysis of the document sets. It 
would have been obvious to one of ordinary skill in the art at the time of the invention to 
generate a second vector space model in the manner of Ruocco in Lantrip's invention in 
order to aid in the clustering analysis of the document sets. 

8. Regarding dependent claim 5, Lantrip discloses that said creating of said centroid seeds 
comprises: classifying said second vector space model using said first document classes 
to produce a classified second vector space model (col. 2, lines 39-42, the vector space 
model is clustered); and determining a mean of vectors in each class in said classified 
second vector space model, wherein said mean comprises said centroid seeds (col. 2, 
lines 43-45, the centroid is the center of mass of the clusters). 
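
The "mean of vectors in each class" limitation of claim 5 amounts to a per-class average. The following is a minimal sketch under assumed inputs; the function name `centroid_seeds`, the class labels, and the sample vectors are hypothetical and do not come from the application or the cited references.

```python
def centroid_seeds(vectors, labels):
    # Accumulate a running sum and a member count for each class label.
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    # The seed for a class is the component-wise mean of its vectors.
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

# Hypothetical classified second vector space model: vectors plus class labels.
second_vectors = [[1, 0], [3, 0], [0, 2], [0, 4]]
labels = ["A", "A", "B", "B"]
seeds = centroid_seeds(second_vectors, labels)
```

Each resulting seed is the center of mass of the vectors assigned to that class, which matches the centroid description the examiner cites from Lantrip.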

9. Regarding dependent claim 6, Lantrip and Ruocco fail to disclose a method further 
comprising forming a second dictionary of most common words in said second dataset; 
generating a third vector space model by counting, for each word in said second 
dictionary, a number of said second documents in which said word occurs; and clustering 
said documents in said second dataset based on said third vector space model to produce 
a second dataset cluster. However, this constitutes simply extending and repeating claim 
3 to a third dataset, and it was notoriously well known in the art at the time of the 
invention that it is useful to repeat steps for multiple datasets to take advantage of their 
utility for subsequent data. It would have been obvious to one of ordinary skill in the art 
at the time of the invention to extend the steps of claim 3 to a subsequent dataset to gain 
the benefits of the analysis for that dataset. 

10. Regarding dependent claim 7, Lantrip discloses in col. 2, lines 39-45 that clustering of 
said documents in said dataset using said centroid seeds produces an adapted dataset 
cluster. However, Lantrip fails to disclose the use of multiple datasets and that the 
method further comprises comparing classes in said adapted dataset cluster to classes in 
said second dataset cluster; and adding classes to said adapted dataset cluster based on 
said comparing. However, in col. 4, lines 61-67, Ruocco deals with comparing multiple 
dataset clusters in order to obtain more information about the relative status of the 
datasets. It would have been obvious to one of ordinary skill in the art at the time of the 
invention to compare multiple dataset clusters in order to obtain more information about 
the relative status of the datasets. 

11. Regarding independent claim 8, it is a system that carries out the method of claim 1, 
and is rejected under similar rationale. 

12. Regarding dependent claim 9, it is a system that carries out the method of claim 2, and 
is rejected under similar rationale. 

13. Regarding dependent claim 10, it is a system that carries out the method of claim 3, and 
is rejected under similar rationale. 

14. Regarding dependent claim 11, it is a system that carries out the method of claim 4, and 
is rejected under similar rationale. 

15. Regarding dependent claim 12, it is a system that carries out the method of claim 5, and 
is rejected under similar rationale. 

16. Regarding dependent claim 13, it is a system that carries out the method of claim 6, and 
is rejected under similar rationale. 

17. Regarding dependent claim 14, it is a system that carries out the method of claim 7, and 
is rejected under similar rationale. 

18. Regarding independent claim 15, it is essentially analogous to claim 1 except that it 
involves the steps of generating a vector space model of said second documents, which 
Ruocco presents in col. 14, lines 27-36, and classifying said vector space model of said 
second documents using said first document classes to produce a classified vector space 
model, which Ruocco presents in col. 14, lines 27-36. It would have been obvious to one 
of ordinary skill in the art at the time of the invention to use the Ruocco form of vector 
space analysis in addition to the Lantrip material from the rejection of Claim 1 in order to 
enhance the classifications of the two datasets. The resulting combination would meet the 
limitations of claim 15. 

19. Regarding dependent claim 16, it is a method that modifies claim 15 in the same 
manner that claim 3 modifies claim 1 and is rejected under similar rationale. 

20. Regarding dependent claim 17, it is a method that modifies claim 15 in the same 
manner that claim 4 modifies claim 1 and is rejected under similar rationale. 

21. Regarding dependent claim 18, it is a method that modifies claim 15 in the same 
manner that claim 6 modifies claim 1 and is rejected under similar rationale. 

22. Regarding dependent claim 19, it is a method that modifies claim 15 in the same 
manner that claim 7 modifies claim 1 and is rejected under similar rationale. 

23. Regarding independent claim 20, Lantrip discloses a method of clustering documents 
comprising: forming a first dictionary of most common words in a first dataset (col. 2, 
lines 30-35, Lantrip forms a first dictionary of common words); generating a first vector 
space model by counting, for each word in said first dictionary, a number of said first 
documents in which said word occurs (col. 2, lines 35-40, Lantrip forms vectors); 
clustering said first documents in said first dataset based on said first vector space model 
to produce first document classes (col. 2, lines 39-42, Lantrip forms clusters); 
determining a mean of vectors in each class in said classified second vector space model 
to produce centroid seeds (col. 2, lines 43-45, Lantrip forms centroid seeds); and 
clustering documents in a second dataset using said centroid seeds (col. 2, lines 45-57, 
Lantrip clusters using centroids). Lantrip fails to disclose generating a second vector 
space model by counting, for each word in said first dictionary, a number of said 
second documents in which said word occurs and classifying said second documents in 
said second vector space model using said first document classes to produce a classified 
second vector space model. However, col. 14, lines 28-36 of Ruocco indicate that vector 
clustering analysis may involve multiple datasets in order to gain the benefit of 
information analysis from multiple sources. It would have been obvious to one of 
ordinary skill in the art at the time of the invention to have vector clustering analysis 
involve multiple datasets in order to gain the benefit of information analysis from 
multiple sources. 
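
The final step discussed above, clustering documents in a second dataset using centroid seeds, can be sketched as a clustering pass that starts from supplied seeds rather than from random initial centers. The sketch below uses a k-means style loop as a stand-in; neither this action nor the cited passages state that k-means is the algorithm used, and the function names and sample data are hypothetical.

```python
def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    # Index of the centroid closest to `vec`.
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def seeded_kmeans(vectors, seeds, iterations=10):
    # Start from the supplied seeds instead of random initial centers.
    centroids = [list(seed) for seed in seeds]
    assignment = [nearest(v, centroids) for v in vectors]
    for _ in range(iterations):
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        new_assignment = [nearest(v, centroids) for v in vectors]
        if new_assignment == assignment:
            break  # assignments are stable, so stop early
        assignment = new_assignment
    return assignment, centroids

# Hypothetical seeds from a first dataset and vectors from a second dataset.
seeds = [[2.0, 0.0], [0.0, 3.0]]
second_vectors = [[1, 0], [3, 1], [0, 2], [1, 4]]
assignment, centroids = seeded_kmeans(second_vectors, seeds)
```

In this sketch the seeds carry the class structure of the first dataset into the second, so the cluster indices of the second dataset line up with the first document classes.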

24. Regarding dependent claim 21, it is a method that modifies claim 20 in the same 
manner that claim 6 modifies claim 1 and is rejected under similar rationale. 

25. Regarding dependent claim 22, it is a method that modifies claim 20 in the same 
manner that claim 7 modifies claim 1 and is rejected under similar rationale. 

26. Regarding independent claim 23, it is a program device embodying instruction to 
perform a method that is equivalent to Claim 1 and is rejected under similar rationale. 

27. Regarding dependent claim 24, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 2 and is rejected under similar rationale. 

28. Regarding dependent claim 25, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 3 and is rejected under similar rationale. 

29. Regarding dependent claim 26, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 4 and is rejected under similar rationale. 

30. Regarding dependent claim 27, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 5 and is rejected under similar rationale. 

31. Regarding dependent claim 28, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 6 and is rejected under similar rationale. 

32. Regarding dependent claim 29, it is a program device embodying instruction to perform 
a method that is equivalent to Claim 7 and is rejected under similar rationale. 

Response to Amendment 

33. Applicant's arguments filed 6/22/2004 have been fully considered but they are not 
persuasive. 

34. Applicant's traversal of the rejection is based on the position that the applied prior art 
does not teach "clustering second documents in a second dataset using said centroid 
seeds". However, Ruocco in combination with Lantrip as outlined in the Office Action 
produces this effect, as noted in the rejection of Claim 1, because it bases the clustering 
on prior categorization efforts. Applicant considers there to be a problem because 
there is no second data set mentioned in Ruocco. This is because 
the combination of Lantrip and Ruocco reflects a sequential processing. 

35. The Examiner notes that the Applicant has corrected minor typographical matters 
resident in the claims. 

Conclusion 

The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. 

USPN 5,999,927 (filing date 12/7/1999)— Tukey et al. 

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jonathan D. Schlaifer whose telephone number is (571) 272- 
4129. The examiner can normally be reached on 8:30-5:00, M-F. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen Hong, can be reached on (571) 272-4124. The fax phone number for the 
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 